{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Quickstart Guide: \n",
    "\n",
    "This Quickstart Guide presents a simple example of **ocean data challenge** for mapping the Sea Surface Height from sparse observations. \n",
    "\n",
    "The methodology is based on an Observing System Simulation Experiment (OSSE). The inputs data represent altimeter observations extracted from a realistic high-resolution ocean model simulation (NATL60). A simple mapping algorithm (Optimal Interpolation) is used to produce the reconstructed SSH field from the sparse observations. Finally, a comparison between the reconstructed and the reference SSH fields is done to quantify the reconstruction scores.\n",
    "\n",
    "Three experiments are carried out: \n",
    "    \n",
    ">   A) **Experiment 1**: demo. of reconstruction with **1 nadir altimeter**\n",
    "\n",
    ">   B) **Experiment 2**: demo. of reconstruction with **4 nadirs altimeter**\n",
    "\n",
    ">   C) **Experiment 3**: demo. of reconstruction with **1 SWOT altimeter**\n",
    "\n",
    "The notebook is structured as follows:\n",
    "\n",
    "    1) downloading the data\n",
    "    2) Setup configuration of the interpolation\n",
    "    3) Run the experiments\n",
    "    4) Plot the reconstruction scores for each experiment\n",
    "\n",
    "This quickstart guide take approx. 30 min to run on a PC. "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "import xarray as xr\n",
    "import numpy\n",
    "import hvplot.xarray\n",
    "import pyinterp\n",
    "import dask\n",
    "import warnings\n",
    "import xrft\n",
    "import logging\n",
    "import pandas as pd\n",
    "warnings.filterwarnings('ignore')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### libraries versions"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "print('xarray', xr.__version__)\n",
    "print('numpy', numpy.__version__)\n",
    "print('hvplot', hvplot.__version__)\n",
    "print('pyinterp', pyinterp.__version__)\n",
    "print('dask', dask.__version__)\n",
    "print('logging', logging.__version__)\n",
    "print('xrft', xrft.__version__)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "logger = logging.getLogger()\n",
    "logger.setLevel(logging.INFO)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "cluster = dask.distributed.LocalCluster()\n",
    "client = dask.distributed.Client(cluster)\n",
    "client"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "from src.mod_oi import *\n",
    "from src.mod_inout import *\n",
    "from src.mod_regrid import *\n",
    "from src.mod_eval import *\n",
    "from src.mod_plot import *"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 1- DOWNLOADING DATA"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Nature run SSH for mapping evaluation"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%time\n",
    "import gcsfs\n",
    "fs = gcsfs.GCSFileSystem('pangeo-181919', requester_pays=True)\n",
    "mapfilesref = fs.get_mapper('pangeo-meom/data-challenge-test/dc_ref')\n",
    "dc_ref = xr.open_zarr(mapfilesref)\n",
    "\n",
    "dc_ref"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Synthetic SSH observation for OI mapping"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!wget https://ige-meom-opendap.univ-grenoble-alpes.fr/thredds/fileServer/meomopendap/extract/ocean-data-challenges/dc_data1/dc_obs.tar.gz"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "!tar -xvf dc_obs.tar.gz"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 2- SETUP CONFIGURATION "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "# OI Grid\n",
    "lon_min = -65.\n",
    "lon_max = -55.\n",
    "lat_min = 33.\n",
    "lat_max = 43.\n",
    "time_min = numpy.datetime64('2012-10-22')\n",
    "time_max = numpy.datetime64('2012-12-02')     \n",
    "dx = 0.2                                                 # zonal grid spatial step (in degree)\n",
    "dy = 0.2                                                 # meridional grid spatial step (in degree)\n",
    "dt = numpy.timedelta64(1, 'D')                           # temporal grid step\n",
    "\n",
    "simu_start_date = '2012-10-01T00:00:00'                  # Nature run initial date\n",
    "\n",
    "glon = numpy.arange(lon_min, lon_max + dx, dx)\n",
    "glat = numpy.arange(lat_min, lat_max + dy, dy)\n",
    "gtime = numpy.arange(time_min, time_max + dt, dt)\n",
    "\n",
    "# OI parameters\n",
    "Lx = 1.            # Zonal decorrelation scale (in degree)\n",
    "Ly = 1.            # Meridional decorrelation scale (in degree)\n",
    "Lt = 7.            # Temporal decorrelation scale (in days)\n",
    "noise = 0.05       # Noise level (5%)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "### 3- RUN EXPERIMENTS"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Select dc_ref sample"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "dc_ref_sample = dc_ref.sel(time=slice(time_min, time_max)).resample(time='1D').mean()\n",
    "del dc_ref\n",
    "dc_ref_sample"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "##### Define input data observations for each experiment "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "one_nadir = ['./dc_obs/2020a_SSH_mapping_NATL60_jason1.nc']\n",
    "four_nadirs = ['./dc_obs/2020a_SSH_mapping_NATL60_envisat.nc', \n",
    "             './dc_obs/2020a_SSH_mapping_NATL60_geosat2.nc',\n",
    "             './dc_obs/2020a_SSH_mapping_NATL60_topex-poseidon_interleaved.nc',\n",
    "             './dc_obs/2020a_SSH_mapping_NATL60_jason1.nc']\n",
    "one_swot =  ['./dc_obs/2020a_SSH_mapping_NATL60_karin_swot.nc', './dc_obs/2020a_SSH_mapping_NATL60_nadir_swot.nc']"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### - EXP. 1: Demo. OI 1 nadir"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "# set OI param & grid\n",
    "ds_oi1_param = oi_param(Lx, Ly, Lt, noise)\n",
    "ds_oi1_grid = oi_grid(glon, glat, gtime, simu_start_date)\n",
    "# Read input obs + discard a bit...\n",
    "coarsening = {'time': 5}\n",
    "ds_oi1_obs = read_obs(one_nadir, ds_oi1_grid, ds_oi1_param, simu_start_date, coarsening)\n",
    "# Run OI\n",
    "for it in range(len(gtime)):\n",
    "    oi_core(it, ds_oi1_grid, ds_oi1_param, ds_oi1_obs)\n",
    "# Regrid    \n",
    "ds_oi1_regrid = oi_regrid(ds_oi1_grid, dc_ref_sample)\n",
    "# Eval\n",
    "rmse_t_oi1, rmse_xy_oi1, leaderboard_nrmse, leaderboard_nrmse_std = rmse_based_scores(ds_oi1_regrid, dc_ref_sample)\n",
    "psd_oi1, leaderboard_psds_score, leaderboard_psdt_score  = psd_based_scores(ds_oi1_regrid, dc_ref_sample)\n",
    "# Print leaderboard\n",
    "data = [['OI 1 nadir', \n",
    "         leaderboard_nrmse, \n",
    "         leaderboard_nrmse_std, \n",
    "         leaderboard_psds_score, \n",
    "         leaderboard_psdt_score,\n",
    "        'Covariances not optimized',\n",
    "        'quickstart.ipynb']]\n",
    "Leaderboard = pd.DataFrame(data, \n",
    "                           columns=['Method', \n",
    "                                    r\"$\\overline{RMSE_{S}}$\", \n",
    "                                    r\"$\\sigma(RMSE_{S})$\", \n",
    "                                    r'$\\lambda_{x}$ (degree)', \n",
    "                                    r'$\\lambda_{t}$ (days)', \n",
    "                                   'Notes',\n",
    "                                   'Reference'])\n",
    "print(\"Summary of the leaderboard metrics:\")\n",
    "Leaderboard\n",
    "print(Leaderboard.to_markdown())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### - EXP. 2: Demo. OI 4 nadirs"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "# set OI param & grid\n",
    "ds_oi2_param = oi_param(Lx, Ly, Lt, noise)\n",
    "ds_oi2_grid = oi_grid(glon, glat, gtime, simu_start_date)\n",
    "# Read input obs + discard a bit...\n",
    "coarsening = {'time': 5}\n",
    "ds_oi2_obs = read_obs(four_nadirs, ds_oi2_grid, ds_oi2_param, simu_start_date, coarsening)\n",
    "# Run OI\n",
    "for it in range(len(gtime)):\n",
    "    oi_core(it, ds_oi2_grid, ds_oi2_param, ds_oi2_obs)\n",
    "# Regrid    \n",
    "ds_oi2_regrid = oi_regrid(ds_oi2_grid, dc_ref_sample)\n",
    "# Eval\n",
    "rmse_t_oi2, rmse_xy_oi2, leaderboard_nrmse, leaderboard_nrmse_std = rmse_based_scores(ds_oi2_regrid, dc_ref_sample)\n",
    "psd_oi2, leaderboard_psds_score, leaderboard_psdt_score  = psd_based_scores(ds_oi2_regrid, dc_ref_sample)\n",
    "# Print leaderboard\n",
    "data = [['OI 4 nadirs', \n",
    "         leaderboard_nrmse, \n",
    "         leaderboard_nrmse_std, \n",
    "         leaderboard_psds_score, \n",
    "         leaderboard_psdt_score,\n",
    "        'Covariances not optimized',\n",
    "        'quickstart.ipynb']]\n",
    "Leaderboard = pd.DataFrame(data, \n",
    "                           columns=['Method', \n",
    "                                    r\"$\\overline{RMSE_{S}}$\", \n",
    "                                    r\"$\\sigma(RMSE_{S})$\", \n",
    "                                    r'$\\lambda_{x}$ (degree)', \n",
    "                                    r'$\\lambda_{t}$ (days)', \n",
    "                                   'Notes',\n",
    "                                   'Reference'])\n",
    "print(\"Summary of the leaderboard metrics:\")\n",
    "Leaderboard\n",
    "print(Leaderboard.to_markdown())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### - EXP. 3: Demo. OI 1 swot"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "%%time\n",
    "# set OI param & grid\n",
    "ds_oi3_param = oi_param(Lx, Ly, Lt, noise)\n",
    "ds_oi3_grid = oi_grid(glon, glat, gtime, simu_start_date)\n",
    "# Read input obs + discard a bit...\n",
    "coarsening = {'time': 15, 'nC': 6}\n",
    "ds_oi3_obs = read_obs(one_swot[0], ds_oi3_grid, ds_oi3_param, simu_start_date, coarsening)\n",
    "# Important line: vectorize for SWOT like data:\n",
    "ds_oi3_obs = ds_oi3_obs.stack(z=('nC', 'time')).dropna(dim='z')\n",
    "# Run OI\n",
    "for it in range(len(gtime)):\n",
    "    oi_core(it, ds_oi3_grid, ds_oi3_param, ds_oi3_obs)\n",
    "# Regrid    \n",
    "ds_oi3_regrid = oi_regrid(ds_oi3_grid, dc_ref_sample)\n",
    "# Eval\n",
    "rmse_t_oi3, rmse_xy_oi3, leaderboard_nrmse, leaderboard_nrmse_std = rmse_based_scores(ds_oi3_regrid, dc_ref_sample)\n",
    "psd_oi3, leaderboard_psds_score, leaderboard_psdt_score  = psd_based_scores(ds_oi3_regrid, dc_ref_sample)\n",
    "# Print leaderboard\n",
    "data = [['OI 1 swot', \n",
    "         leaderboard_nrmse, \n",
    "         leaderboard_nrmse_std, \n",
    "         leaderboard_psds_score, \n",
    "         leaderboard_psdt_score,\n",
    "        'Covariances not optimized',\n",
    "        'quickstart.ipynb']]\n",
    "Leaderboard = pd.DataFrame(data, \n",
    "                           columns=['Method', \n",
    "                                    r\"$\\overline{RMSE_{S}}$\", \n",
    "                                    r\"$\\sigma(RMSE_{S})$\", \n",
    "                                    r'$\\lambda_{x}$ (degree)', \n",
    "                                    r'$\\lambda_{t}$ (days)', \n",
    "                                   'Notes',\n",
    "                                   'Reference'])\n",
    "print(\"Summary of the leaderboard metrics:\")\n",
    "Leaderboard\n",
    "print(Leaderboard.to_markdown())"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "#### - PLOT EVALUATION SCORES"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rmse_concat = xr.concat((rmse_t_oi1, rmse_t_oi2, rmse_t_oi3), dim='experiment')\n",
    "rmse_concat['experiment'] = [\"1 nadir\", \"4 nadirs\", \"1 SWOT\"]\n",
    "rmse_concat.hvplot.line(x='time', y='rmse_t', by='experiment', ylim=(0, 1), cmap=['royalblue', 'orange', 'lightcoral'], title='RMSE-based scores')"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The figure above represent the RMSE-based score timeseries for the SSH reconstruction with 1 nadir, with 4 nadirs and with 1 SWOT. Several conclusions can be drawn: a) better score is found in the reconstruction with 4 nadirs than with 1 nadir, b) reconstruction with 1 SWOT or with 4 nadir are relatively equivalent, c) the variability of the SWOT score is higher than with 4 nadir, directly linked to the spatio-temporal sampling of the observations (See figure below for the number of observation per day in the OI) "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "nobs_concat = xr.concat((ds_oi1_grid.nobs, ds_oi2_grid.nobs, ds_oi3_grid.nobs), dim='experiment')\n",
    "nobs_concat['experiment'] = [\"1 nadir\", \"4 nadirs\", \"1 SWOT\"]\n",
    "nobs_concat.hvplot.bar(x='time', y='nobs', by='experiment', alpha=0.7, stacked=True, cmap=['orange', 'royalblue', 'lightcoral'], title='# obs in OI')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "rmse_xy_concat = xr.concat((rmse_xy_oi1, rmse_xy_oi2, rmse_xy_oi3), dim='experiment')\n",
    "rmse_xy_concat['experiment'] = [\"1 nadir\", \"4 nadirs\", \"1 SWOT\"]\n",
    "rmse_xy_concat.hvplot.contourf(x='lon', y='lat', levels=list(numpy.arange(0.,0.75, 0.05)), height=300, width=400, cmap='Reds', subplots=True, by='experiment', clabel='RMSE[m]')"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "psd_concat = xr.concat((psd_oi1, psd_oi2, psd_oi3), dim='experiment')\n",
    "psd_concat['experiment'] = [\"1 nadir\", \"4 nadirs\", \"1 SWOT\"]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": [
    "plot_psd_score_v0(psd_concat)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "The PSD-based score evaluates the spatio-temporal scales resolved in the mapping. The resolutions can be defined as the contour where the PSD-score = 0.5, black contour in the figure (i.e., the spatio-temporal scales where the reconstruction SSH error level is 2 times lower than the true SSH signal). The Figure above illustrates the spatio-temporal scales resolved in each experiment. It particularly shows the increase of resolving capability from 1 nadir to 1 SWOT altimeter.   "
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.7"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 4
}
